Ken Human 


John Stealey 

APOLLO, COLUMBIA, CHALLENGER

CASE STUDIES



Introduction 

® Overview 

• Why we're here 

• Organizational "culture," leadership, and the model
technical organization

• Rules of engagement 

® Case Study Discussion 

• Apollo 13 

• Challenger 

• Columbia 

® Conclusion 




Key Concepts 


® Technical Competence versus 
Bureaucratic Process 

® Schedule Pressure versus Safety as a 
Priority 

® Normalization of Deviance 

® Suppressing versus Encouraging Dissent 

® The Role of Data in Decision Making 

® Attributes of a Model Technical 
Organization 




Apollo 13 

Launched April 11, 1970 



Apollo 13 


® Physical Cause 

• Cryogenic tank heater circuit design flaw; 

• Teflon insulation damaged during pre-launch 
testing; 

• Bare copper wires in the tank were 
submerged in liquid oxygen during servicing; 

• On day 3 of mission, during cryo tank 2 stir, a 
spark jumped between wires of the heater 
circuit. 



Apollo 13 

® Apollo Culture 

"Okay, listen up. When you leave this room, you must 
leave believing that this crew is coming home. I don't 
give a damn about the odds and I don't give a damn 
that we've never done anything like this before. Flight 
control will never lose an American in space. You've got 
to believe, your people have to believe, that this crew 
is coming home. Now let's get going." 

Gene Kranz 
Lead Flight Director 
April 11, 1970 




Challenger, STS-51L

Launched January 28, 1986



Challenger, STS-51L 


® Physical Cause - Solid Rocket Booster 
Field Joint Design Deficiency 

• Cold temperature reduced ability of O-ring to 
seal Field Joint 

• Exhaust gas leaked from Right SRB 

• Weakened P-12 Strut failed 

• SRB rotated into External Tank 

• Structure broke apart, Orbiter rotated, 
exceeded structural limits 




Challenger, STS-51L 


® Organizational Cause 

• Lack of safety emphasis regarding doubts 
about the SRB joint seal 

• Launch constraint waivers at the expense of 
safety and not reviewed by all levels of 
management 

• Internal problem resolution focus rather than 
communicating externally 

• Contractor management overruled its own
engineers to accommodate its customer




Normalization of Deviance 


® When 1977 tests indicated some joint opening, 
contrary to joint designers' expectations, a sealing 
putty "fix" was added, and the anomaly was 
considered an "acceptable risk." 

® When a 1981 launch resulted in blow-by through 
the putty, this anomaly was explained as a result of 
improperly applied putty. 

® When 1984 and 1985 launches caused more
leakage, this leakage had come to be expected and
acceptable.




Study Guide 


Challenger Case Study 


Statistical Analysis of O-ring Failure Data Available Prior to Challenger 

[Figure: scatter plot of O-ring failure data versus launch temperature]

"...with the benefit of hindsight, it can be seen that the Challenger disaster was not at all
surprising, given data that were available at the time of the flight. As a result of its 
investigations, one of the recommendations of the commission was that a statistician be 
part of the ground control team from that time on.” 


"The Flight of the Space Shuttle Challenger," Jeffrey S. Simonoff, 1999.
http://www.stanford.edu/class/stat2Q1/readina/challloa.pdf 





Columbia, STS-107

Launched January 16, 2003;
Deorbited February 1, 2003



Columbia, STS-107

® Physical Cause of Accident 

• Ascent 

o Insulating foam separated from the left bipod ramp 
of the External Tank (81.7 seconds after launch) 

o Breached leading edge of the left wing 

• Entry 

o Superheated air penetrated, progressively melted 
the aluminum, weakened structure; increasing 
aerodynamic forces caused loss of control, failure of 
the wing, and breakup 



Columbia, STS-107


Figure 6.1-6. This chart shows the number of dings greater than one inch in diameter on the lower surface of the Orbiter after each mission
from STS-6 through STS-113. Flights where the bipod ramp foam is known to have come off are marked with a red triangle.








Flights with Significant TPS Damage (14)


MISSION    DATE                 COMMENTS

STS-1      April 12, 1981       Lots of debris damage. 300 tiles replaced.

STS-7      June 18, 1983        First known left bipod ramp foam shedding event.

STS-27R    December 2, 1988     Debris knocks off tile; structural damage and near
                                burn through results.

STS-32R    January 9, 1990      Second known left bipod ramp foam event.

STS-35     December 2, 1990     First time NASA calls foam debris "safety of flight
                                issue," and "re-use or turn-around issue."

STS-42     January 22, 1992     First mission after which the next mission (STS-45)
                                launched without debris In-Flight Anomaly
                                closure/resolution.

STS-45     March 24, 1992       Damage to wing RCC Panel 10-right. Unexplained
                                Anomaly, "most likely orbital debris."

STS-50     June 25, 1992        Third known bipod ramp foam event. Hazard Report 37:
                                an "accepted risk."

STS-52     October 22, 1992     Undetected bipod ramp foam loss (fourth bipod event).

STS-56     April 8, 1993        Acreage tile damage (large area). Called "within
                                experience base" and considered "in family."

STS-62     October 4, 1994      Undetected bipod ramp foam loss (fifth bipod event).

STS-87     November 19, 1997    Damage to Orbiter Thermal Protection System spurs
                                NASA to begin 9 flight tests to resolve foam-shedding.
                                Foam fix ineffective. In-Flight Anomaly eventually
                                closed after STS-101 as "accepted risk."

STS-112    October 7, 2002      Sixth known left bipod ramp foam loss. First time
                                major debris event not assigned an In-Flight Anomaly.
                                External Tank Project was assigned an Action. Not
                                closed out until after STS-113 and STS-107.

STS-107    January 16, 2003     Columbia launch. Seventh known left bipod ramp foam
                                loss event.




Briefing Slide from STS-113 FRR 

(launched November 2002) 

® STS-112/ET-115 Bipod Ramp foam loss 

Missing foam on -Y Bipod Ramp - Picture 

• Issue 

o Foam was lost on the STS-112/ET-115 -Y bipod ramp (~4" X
5" X 12") exposing the bipod housing SLA closeout 

• Background 

o ET TPS Foam loss over the life of the Shuttle Program has 
never been a "Safety of Flight" issue 

o More than 100 External Tanks have flown with only 3 
documented instances of significant foam loss on a bipod 
ramp 



Briefing Slide from STS-113 FRR 

(launched November 2002) 

® Rationale for Flight 

• Current bipod ramp closeout has not been changed since STS-54 

• The Orbiter has not experienced "Safety of Flight" damage from loss of foam 
in 112 flights (including 3 known flights with bipod ramp foam loss) 

• There have been no design / process / equipment changes over the last 60
ETs (flights) 

• All ramp closeout work (including ET-115 and ET-116) was performed by 
experienced practitioners (all over 20 years experience each) 

• Ramp foam application involves craftsmanship in the use of validated 
application processes 

• No change in Inspection / Process control / Post application handling, etc 

• Probability of loss of ramp TPS is no higher/no lower than previous flights 

• The ET is safe to fly with no new concerns (and no added risk)






Columbia, STS-107



Figure 6.3-1. The small cylinder at top illustrates the size of debris Crater was intended to analyze. The
larger cylinder was used for the STS-107 analysis; the block at right is the estimated size of the foam.



Review of Test Data Indicates Conservatism for Tile Penetration


® The existing SOFI on tile test data used to create Crater 
was reviewed along with STS-87 Southwest Research data 

• Crater over predicted penetration of tile coating significantly 
o Initial penetration to described by normal velocity 

• Varies with volume/mass of projectile (e.g., 200ft/sec for 3cu. In) 

o Significant energy is required for the softer SOFI particle to penetrate 
the relatively hard tile coating 

• Test results do show that it is possible at sufficient mass and 
velocity 

o Conversely, once tile is penetrated SOFI can cause significant damage 

• Minor variations in total energy (above penetration level) can cause 
significant tile damage 

• Flight condition is significantly outside of test database 

o Volume of ramp is 1920cu in vs 3 cu in for test 
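
The last bullet above carries the key number on the slide, and a quick calculation makes its scale explicit. The sketch below uses only the two volumes quoted on the slide; the helper name `extrapolation_factor` is hypothetical and introduced purely for illustration.

```python
# Volumes quoted on the briefing slide, in cubic inches.
TEST_VOLUME_CU_IN = 3      # projectile volume in the test database
RAMP_VOLUME_CU_IN = 1920   # volume of the bipod ramp stated on the slide


def extrapolation_factor(flight_value: float, test_value: float) -> float:
    """Hypothetical helper: ratio of the flight condition to the tested condition."""
    if test_value <= 0:
        raise ValueError("test_value must be positive")
    return flight_value / test_value


factor = extrapolation_factor(RAMP_VOLUME_CU_IN, TEST_VOLUME_CU_IN)
print(f"Flight debris volume is {factor:.0f} times the tested volume")
```

Stated this way, "significantly outside of test database" means the analysis extrapolated by a factor of several hundred beyond anything that had been tested.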



Original Message

From: STICH, J. S. (STEVE) (JSC-DA8) (NASA)

Sent: Thursday, January 23, 2003 11:13 PM

To: CDR; PLT

Cc: BECK, KELLY B. (JSC-DA8) (NASA); ENGELAUF, PHILIP L. (JSC-DA8) (NASA); CAIN, LEROY E. (JSC-DA8)
(NASA); HANLEY, JEFFREY M. (JEFF) (JSC-DA8) (NASA); AUSTIN, BRYAN P. (JSC-DA8) (NASA)

Subject: INFO: Possible PAO Event Question

Rick and Willie, 

You guys are doing a fantastic job staying on the timeline and accomplishing great science. Keep up 
the good work and let us know if there is anything that we can do better from an MCC/POCC stand- 
point. 

There is one item that I would like to make you aware of for the upcoming PAO event on Blue FD
10 and for future PAO events later in the mission. This item is not even worth mentioning other than 
wanting to make sure that you are not surprised by it in a question from a reporter. 

During ascent at approximately 80 seconds, photo analysis shows that some debris from the area of
the -Y ET Bipod Attach Point came loose and subsequently impacted the orbiter left wing, in the area
of transition from Chine to Main Wing, creating a shower of smaller particles. The impact appears 
to be totally on the lower surface and no particles are seen to traverse over the upper surface of the 
wing. Experts have reviewed the high speed photography and there is no concern for RCC or tile 
damage. We have seen this same phenomenon on several other flights and there is absolutely no 
concern for entry. 

That is all for now. It's a pleasure working with you every day.


[MCC/POCC = Mission Control Center/Payload Operations Control Center, PAO = Public Affairs Office, FD 10 = Flight Day
Ten, -Y = left, ET = External Tank]



Columbia, STS-107


® Organizational Causes (Conditions) 

• Rationalization of Danger 

• Barriers to Communication, Stifled Professional 
Differences of Opinion 

• Informal Chain of Command and Decision-making 
Outside Program Rules 

• Reliance on Past Success as a Substitute for Sound 
Engineering Practices (Reduced Testing) 



Columbia, STS-107

® Organizational Causes (Conditions) 

• Ineffective Checks and Balances 

• Lack of Independent Safety Program 

• Lack of Integrated Management 

• Not a Learning Organization 

• Intense Self-imposed Schedule Pressure 

• Attempted to Realize Efficiencies under Resource 
Constraints 

• Fluctuating Priorities 

• Lack of an Agreed National Vision 




Columbia, STS-107


® NASA Culture at the Time of Columbia? 

• "...if there was severe damage to the tiles, nothing 
could be done." 

NASA's Thermal Protection System (tile) Expert 


• "it [imaging] was no longer being pursued since even 
if we saw something, we couldn't do anything about 
it. The Program didn't want to spend the resources." 

NASA's Mission Management Team Chair 




Columbia, STS-107

® Columbia Accident Investigation Board: 


• "Based on NASA's history of ignoring external 
recommendations, or making improvements that 
atrophy with time, the Board has no confidence that 
the Space Shuttle can be safely operated for more 
than a few years based solely on renewed post-
accident vigilance."




NASA’s Human Spaceflight Challenge 


® Highly advanced leading edge technology 

• Tremendous energy required to accelerate 100 tons to orbital velocity 

• Difficult to manage this advanced technology 

• System is large, complex, unpredictable, cannot test everything, 
cannot foresee all possible environments (unknown risks) 

® High Visibility 

• Intense media coverage and public interest 

® Organizational Complexity/Size/Diversity 

• High number of decisions and people involved per event 

• Independent safety 



Conclusion


® Can we learn from the past? 

® Are we a learning organization? 

® How can we improve our organization? 





Space Disasters 


As of 2004, space disasters during operations or training have killed 18
astronauts and 4 cosmonauts (5% of all people who have been in space,
2% per flight) and a much larger number of ground crew.

As of November 2004, 439 individuals have flown on space flights.
Twenty-two have died while in spacecraft: Apollo 1 (3), Soyuz 1 (1),
X-15-3 (1), Soyuz 11 (3), Challenger (7), Columbia (7), totaling 18
astronauts (4.1%) and 4 cosmonauts (0.9%) of all the people launched.

If Apollo 1 and X-15-3 are excluded, 4% (18) of the 437 have died
while on a spaceflight. This excludes Gus Grissom, Ed White, Roger
Chaffee, and Michael J. Adams from the killed total and Chaffee and
Adams from the space flight total.


from Wikipedia 
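
The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text (439 people flown, and the per-accident fatalities); the names `pct`, `DEATHS`, and `COSMONAUT_FLIGHTS` are hypothetical and chosen for illustration.

```python
# Counts taken from the text above (as of November 2004).
FLOWN = 439
DEATHS = {
    "Apollo 1": 3,
    "Soyuz 1": 1,
    "X-15-3": 1,
    "Soyuz 11": 3,
    "Challenger": 7,
    "Columbia": 7,
}
COSMONAUT_FLIGHTS = {"Soyuz 1", "Soyuz 11"}


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_deaths = sum(DEATHS.values())
cosmonauts = sum(n for name, n in DEATHS.items() if name in COSMONAUT_FLIGHTS)
astronauts = total_deaths - cosmonauts

# Excluding Apollo 1 and X-15-3 removes four deaths from the killed total,
# and removes Chaffee and Adams (two people) from the flown total.
in_flight_deaths = total_deaths - DEATHS["Apollo 1"] - DEATHS["X-15-3"]
in_flight_flown = FLOWN - 2

print(total_deaths, pct(total_deaths, FLOWN))
print(astronauts, pct(astronauts, FLOWN))
print(cosmonauts, pct(cosmonauts, FLOWN))
print(in_flight_deaths, pct(in_flight_deaths, in_flight_flown))
```

The results agree with the rounded figures in the text: 22 deaths is about 5% of 439, and 18 of 437 is about 4%.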



Apollo 1 (AS-204) Fire 

® Probable Cause 

• No single ignition source conclusively identified 

• Evidence of several electrical arcs found 

® Physical Environment 

• Extremely hazardous: 100% oxygen, 16.7 psi, many types of
combustible materials 

• Deficiencies in design, manufacture, installation, rework and 
quality control existed in the electrical wiring 

• Improper emergency procedures, no escape procedures for fire, 
heavy Exit Hatch (1 min. manual operation) 

• Emergency fire, rescue and medical teams were not in attendance 
(not labeled hazardous test) 




After the Challenger disaster, both official investigations decried 
the competitive pressures and economic scarcity that had 
politicized the space agency, asserting that goals and resources 
must be brought into alignment. Steps were taken to assure 
that this happened. But at this writing, that supportive political 
environment has changed. NASA is again experiencing the 
economic strain that prevailed at the time of the disaster. Few 
of the people in top NASA administrative positions exposed to 
the lessons of the Challenger tragedy are still there. The new 
leaders stress safety, but they are fighting for dollars and 
making budget cuts. History repeats, as economy and 
production are again priorities. (Vaughan, 1996: 422) 



Table 1. The Physical Cause and Organizational Cause of the Columbia accident as determined by the CAIB.


Physical Cause and Organization Cause 


CAIB Report Chapter 3, p49: 

The physical cause of the loss of Columbia and its crew was a breach in the Thermal Protection 
System on the leading edge of the left wing. The breach was initiated by a piece of insulating 
foam that separated from the left bipod ramp of the External Tank and struck the wing in the 
vicinity of the lower half of Reinforced Carbon-Carbon panel 8 at 81.9 seconds after launch.
During re-entry, this breach in the Thermal Protection System allowed superheated air to 
penetrate the leading-edge insulation and progressively melt the aluminum structure of the left 
wing, resulting in a weakening of the structure until increasing aerodynamic forces caused loss of 
control, failure of the wing, and breakup of the Orbiter. 

CAIB Report Chapter 7, p177:

The organizational causes of this accident are rooted in the Space Shuttle Program’s history and
culture, including the original compromises that were required to gain approval for the Shuttle 
Program, subsequent years of resource constraints, fluctuating priorities, schedule pressure, 
mischaracterizations of the Shuttle as operational rather than developmental, and lack of an 
agreed national vision. Cultural traits and organizational practices detrimental to safety were 
allowed to develop, including: reliance on past success as a substitute for sound engineering 
practices (such as testing to understand why systems were not performing in accordance with 
requirements/specifications); organizational barriers that prevented effective communication of 
critical safety information and stifled professional differences of opinion; lack of integrated 
management across program elements; and the evolution of an informal chain of command
and decision-making processes that operated outside the organization’s rules. 



By the Time of STS-107, Foam Loss Was Regarded as “In-Family”


Definitions 

In Family: A reportable problem that was previously experienced,
analyzed, and understood. Out of limits performance
or discrepancies that have been previously experienced may
be considered as in-family when specifically approved by the
Space Shuttle Program or design project.

Out of Family: Operation or performance outside the expected
performance range for a given parameter or which has
not previously been experienced.

Accepted Risk: The threat associated with a specific circumstance
is known and understood, cannot be completely
eliminated, and the circumstance(s) producing that threat is
considered unlikely to reoccur. Hence, the circumstance is
fully known and is considered a tolerable threat to the conduct
of a Shuttle mission.

No Safety-of-Flight-Issue: The threat associated with a
specific circumstance is known and understood and does not
pose a threat to the crew and/or vehicle.



Poor Communication via Charts



High Reliability Organizations


® Unexpected Events

• High Reliability Organizations find significant 
meaning in Weak Signals 

o They notice unexpected events - more, sooner, 
smaller 

o They concentrate more fully on the discrepancy, 
meaning, and resolution 

o "Don't give weak response to Weak Signals" 




Intuition 

® Significance of Intuition 

• Intuition is a powerful internal resource and a gift 
that humans have 

o (Retention of knowledge is a skill) 

• Intuition is always a response to something 

o Everything it communicates to you is meaningful 
(although it may occasionally send out a signal that is 
less than urgent) 

• Intuition is a cornerstone of personal safety & 
Mission safety 



Accident signals

® Every accident gives signals before it becomes 
an accident 

• Anomalies 

• Words, data, or charts in meetings 

• Weak signals 

• Ephemeral signals 

® Small errors accumulate 

® Failure set in motion from beginning 

® Growing apprehension encourages methods of 
decision making that make failure even more 
likely, then inevitable 

® Develop sensors to detect signals 




